A cryo-EM structure of KTF1-bound polymerase V transcription elongation complex

De novo DNA methylation in plants relies on transcription of RNA polymerase V (Pol V) along with KTF1, which produce long non-coding RNAs for recruitment and assembly of the DNA methylation machinery. Here, we report a cryo-EM structure of the Pol V transcription elongation complex bound to KTF1. The structure reveals the conformation of the structural motifs in the active site of Pol V that accounts for its inferior RNA-extension ability. The structure also reveals structural features of Pol V that prevent it from interacting with the transcription factors of Pol II and Pol IV. The KOW5 domain of KTF1 binds near the RNA exit channel of Pol V providing a scaffold for the proposed recruitment of Argonaute proteins to initiate the assembly of the DNA methylation machinery. The structure provides insight into the Pol V transcription elongation process and the role of KTF1 during Pol V transcription-coupled DNA methylation.

this mean exactly: the unique six peptide sequences represented 35 times in total, or are there other peptide sequences that are common among other NRPE5s?
15. Fig. S4G shows overall agreement between the model and map 4, but it is difficult to investigate each residue. The density-model validation for a few key side chains of NRPE5c-specific residues, in comparison to NRPE5a, could be shown to support the identity of the subunit variant (like Fig. S10 for the Pol V-KTF1 interface).
16. Page 9, lines 8-9: the two Pol V-specific subunits "in Arabidopsis thaliana" should be noted because Pol V subunit composition and usage vary in other plants; for example, Pol IV and Pol V share the same 5th subunit in maize (JR Haag et al. 2014), an NRPE5 unlike NRPB5 of Pol II.
17. NRPD(E)2, NRPE(B/D)3 and NRPE(A/B/C/D)12 subunits are all shared with Pol IV. Would this mean that KTF1-KOW5 can also associate with Pol IV at the same location? If not, what would prevent Pol IV-KTF1 interaction?
18. RDR2 cannot be considered as a "general transcription factor" or a "transcription elongation factor" for Pol IV. It is a partner in a dual polymerase transcription complex. If anything, RDR2 is a Pol IV backtracking factor, not an elongation factor. Please correct these statements.
19. In the section "The KOW5 domain of KTF1 locates at the RNA exit channel of Pol V", page 11, line 11, Fig. 4B-4E: How are the H-bonds (dashed lines) defined in these figures? In the model (Pol V_TEC.pdb), one of the highlighted H-bonds, between K639 of KTF1-KOW5 and S855 of NRPE(D)2, has a distance of 4.671 Å, which requires relaxing the distance and angle criteria to be detected as a potential H-bond by the hbonds command in UCSF Chimera. Under those relaxed conditions, several other potential H-bonds (D858, 4.044 Å, and S841, 4.825 Å) also show up for K639. It is not clear how the K639-S855 H-bond was selected to be shown in Fig. 4E. Please clarify how this and the other H-bonds shown in Fig. 4B-4E were defined and presented.
20. Page 11, line 15: The two-amino acid (SQ) deletion of SPT5 mentioned here is found only in three Brassicaceae (A. thaliana, C. rubella and E. salsugineum) out of 14 plant species in Fig. S11A. Would the Arabidopsis Pol V-KTF1-KOW5 interface be unique to Brassicaceae, and might SPT5 interact with Pol V in other plant species through the SQ residues?
21. It would have been helpful to see whether mutations of the KTF1-KOW5 residues involved in the Pol V-specific interaction impair the specific interaction with Pol V, to support the cryo-EM structure.
22. Fig. 5 depicts a short stretch of GW/WG tails of NRPE1 and KTF1 anchored to AGO4 close to Pol V. Both NRPE1-CTD and KTF1-CTD have long intrinsically disordered regions with many GW/WG motifs: 17 motifs in a 353-aa region and 44 motifs in a 762-aa region, respectively, both extending from near the Pol V RNA exit channel toward upstream dsDNA. Considering the potentially long stretch of these tails, AGO4 and DRM2 might be recruited to either one or both long flexible tails in a wide range of locations relative to transcribing Pol V. A suggestion is to include this possibility in the model.
23. The Wierzbicki et al. papers of 2008 (Cell) and 2009 (Nature Genetics), which were the first to identify Pol V transcripts and show that AGO4-siRNA complexes are recruited to the transcripts, also were the first to show that two of the three proteins that were later found to associate within the so-called DDR complex are required for Pol V transcript production. These papers should be cited when discussing the role of DDR complex proteins in Pol V transcription.
Minor comments:
• The authors claim that Pol V has 'inferior' RNA extension ability compared to Pol II in both the abstract and the discussion. Please provide references to support this claim.
• In the abstract, the authors stated that 'The KOW5 domain of KTF1 binds near the RNA exit channel of Pol V, where it can recruit Argonaute proteins to initiate assembly of DNA methylation machinery'. The latter part of this statement is speculation, and not a primary result, and should be stated as such.
• The manuscript uses a great many abbreviations, and these should be spelled out the first time and abbreviated thereafter. For example, Pol V should be named Nuclear RNA polymerase V or Nuclear DNA-dependent RNA polymerase V.
• There is no marker/ladder in Fig. S1D. What is the method for determining that the primary RNA products in the autoradiogram are approximately 45 nt in length? In addition, please label the position of the RNA primer in the autoradiogram.

Typos
The manuscript has many typographical errors that will need a copy editor to correct, including (but not limited to):
• Materials and Methods/Plasmids: pLZ009-GEP-NRPE1 should be pLZ009-GFP-NRPE1?
• p.18, line 9 from bottom: "... the preliminary unreported dataset. subjected to 2D classification (N=30, iterations=25)." is a broken and incomplete sentence. Please fix this, adding the information on low-pass filtering applied to the auto-picking template (i.e. 2D class averages of particles from a preliminary unreported dataset, if understood correctly). Also, the auto-picking template info should be included in the Fig. S2A flowchart for clarification (at "RELION auto-picking Extract 2D classification").
Reviewer #2 (Remarks to the Author): In this manuscript, Zhang et al. report the cryo-EM structure of the plant RNA polymerase V as a transcription elongation complex (TEC) associated with the elongation factors KTF1 and SPT4 (not visible). The study provides the first Pol V structure. Structural comparison with Pol II and Pol IV revealed unique structural features of Pol V that are distinct from other RNA polymerases and explained some of the Pol V-specific properties, including its RNA-extension and DNA-translocating activities and its lack of interaction with general transcription factors. While I support publication of this study, I strongly suggest that the manuscript be substantially improved and polished before publication.
Major concerns: 1. The manuscript mainly discusses structural features of Pol V, which is fine given the nature of the study. However, it would be better to integrate the structural findings of this work with previous genetic and biochemical studies, making the manuscript more informative to a broad readership rather than only to structural biologists in the transcription field. The authors may cite and discuss previous results, especially those related to the major findings in this work. 2. From the data presented in the manuscript, it is not easy to evaluate the quality of the cryo-EM map, especially for the critical residues discussed in the main text. These regions should be shown in cartoon/stick representation, well covered by the corresponding cryo-EM map, to indicate that the structural model was correctly built. The figures could be shown as supplementary figures. Otherwise, discussion at the molecular level may not be supported by the experimental data. It is still fine to describe and discuss these regions by docking structural modules (predicted or from other structures) into the cryo-EM maps, but this should be clarified in the manuscript.
Minor concerns: 3. The manuscript seems not to be well-prepared. For example, should it be Pol IV-RDR2 in "we have determined the three-dimensional cryo-EM structures of Pol V-RDR2 complex and reported its unique two-RNA polymerase architecture as well as its unprecedent 'backtracking-triggered interpolymerase RNA channeling' mechanism.". The following sentence, "However, due to the lack of structural information of Pol V is available" seems to be "However, due to the lack of structural information of Pol V". A number of typos could be found throughout the manuscript. The authors should carefully check the manuscript and make correction. 4. Introduction section, "In plants, ……" described DNA methylation pathway by Pol IV, which is not directly related to the major focus of this study. The authors may shorten this part and add more description of Pol V if necessary.
Reviewer #3 (Remarks to the Author): In this paper, Zhang et al. report the cryo-EM structure of a transcription elongation complex of Arabidopsis thaliana RNA polymerase V, which is involved in the RNA-directed DNA methylation pathway in plants. Although Pol V shows a similar structure to Pol II, the DNA-binding cleft of the Pol V TEC is much wider than that of the Pol II TEC. The structure also showed the KOW5 domain of KTF1 bound near the RNA-exit channel, similar to the corresponding domain of the Pol II elongation factor Spt5. The unique structural and sequence features of Pol V may explain the incompatibility of Pol V with the Pol II initiation and elongation factors. Overall, this work is technically sound, and has revealed a novel structure of an important enzyme complex. However, there are several concerns that should be addressed. 3. It is written, "Arabidopsis KTF1 contains an NGN domain, three KOW domains (KOW1/4/5)". This reviewer is interested in whether the KOW domains 2, 3 and X are missing in KTF1. According to the AlphaFold prediction, there could be at least one more KOW domain present. For the KOW domains, the authors may refer to the mammalian Pol II-DSIF complex (doi:10.1038/nsmb.3465) or more recent yeast Pol II-elongation factor complexes (doi:10.1126/science.abp9466), as they exhibit more complete KOW domains.
Minor points: 1. Overall, there are many typographical and grammatical errors. Especially, typos concerning the names of Pol IV/Pol V are quite confusing. These should be carefully checked and corrected. For example: "In our previous work, we have determined the three-dimensional cryo-EM structures of Pol V-RDR2 complex" "NTP incorporation efficiency and processivity than that of Po IV" "In summary, our structure shows detailed interface between Po V" 2. In Fig. S1B, the Pol V preparation contains a lot of impurities. Is it pure enough for the transcription assays or structural analysis? Were the impurities removed after the complex formation and the gel filtration for the structural analysis? Please provide a chart and an SDS-PAGE for the gel filtration step. Also, did the authors observe stoichiometric binding of KTF1?
3. In Fig. S1D, the RNA band positions do not appear to match the sequence diagram on the left side.
4. In the method section, it is written, "The AlphaFold2-predicted structure model of Pol V was used as the start model for model building". What about the KTF1 part? Was it also predicted with AlphaFold2?
Table S2: 300 keV should be 300 kV.

Reviewer #1 (Remarks to the Author):
Using cryo-EM single-particle analysis, this important study by Zhang/Huang/Gu et al. shows the first structure of plant-specific nuclear DNA-dependent RNA Polymerase V (Pol V) bound by Kyrpides-Ouzounis-Woese (KOW) domain-containing transcription factor 1 (KTF1; also known as SUPPRESSOR OF TY INSERTION 5-LIKE, SPT5L), together with a nucleic acid scaffold composed of RNA, template DNA, and non-template DNA to mimic the RNAs of an elongation complex. The overall architecture of Pol V with the DNA/RNA scaffold is very similar to that of Pol II. Unique amino acid compositions in catalytic motifs such as the bridge helix and trigger loop may explain characteristics of Pol V transcription activity reported previously by others. The Pol V-specific N-terminal extension of the 5th subunit (NRPE5) contributes to the unique interaction interface with NRPE1 (the largest subunit), explaining the subunit and polymerase-specificity of the interaction. Also, the identification of the KOW5 domain of KTF1 in the cryo-EM map confirms that its binding location is similar to that of SPT5 when interacting with Pol II. A few Pol V-KTF1 specific differences in the interacting amino acids may account for the Pol V specificity of the interaction. The KTF1-KOW5 domain is located near the RNA exit channel, which would place Glycine-Tryptophan (GW)-rich repeats of KTF1 in the vicinity of Pol V nascent transcripts. These GW repeats of KTF1 and the GW repeats of the Pol V C-terminal domain were shown to be genetically redundant by Lagrange and colleagues in 2016 (see citation information below) and serve as so-called AGO hooks for binding AGO4. Moreover, a new paper by Wang et al. (see citation information below) has shown that ARGONAUTE 4 (AGO4) slices and retains its target RNAs in vivo and in vitro, with prior studies showing that Pol V is the source of the target RNAs in vivo. 
Collectively, these findings, which are not currently cited, together with the finding that AGO4 can physically interact with DOMAINS REARRANGED METHYLTRANSFERASE 2 (DRM2) (Zhong et al., cited as reference 39 in the current manuscript) support the model the authors present at the end of the paper. The new cryo-EM structure provides the structural context for this model and a basis for future studies exploring how additional proteins associate with the complex to bring about DNA methylation.

Reply:
We thank the referee for the encouraging comments and recommendation. The mentioned references have been cited in the revised manuscript.
Major comments/critiques: Q1. The term "transcription elongation complex (TEC)" for this study needs to be clarified. The method for "TEC" assembly is derived from other studies with well-characterized RNAPs such as Pol II. But no RNA synthesis and RNA elongation by nucleotide incorporation is demonstrated for the assembled RNA-DNA scaffold, thus it is not clear that the complex is truly an elongation complex in which the synthetic RNA can be extended.
Reply: Thanks for the comment. As the referee points out, the method we used to assemble the "Pol V TEC" has been used in structural studies of transcription elongation complexes of well-characterized RNAPs, including bacterial RNAP, Pol I, Pol II, and Pol III. The pre-melted DNA-RNA scaffold restrains RNAP in a pre-defined translocation state, which reduces conformational heterogeneity as much as possible and permits structure determination at high resolution. However, the pre-melted DNA-RNA scaffold is not ideal for testing RNA-extension activity, because the upstream DNA cannot rewind, which hinders forward translocation of RNAP. Therefore, we employed another DNA-RNA scaffold with fully complementary non-template and template DNA to show that the purified Pol V sample can extend RNA (Supplementary Fig. 1d).

Q2.
Is KTF1 truly an elongation factor? A study in 2009 by He et al, (An effector of RNA-directed DNA methylation in Arabidopsis is an ARGONAUTE 4-and RNA-binding protein. Cell 137: 498-508) identified a KTF1 mutant they called DMS3. They examined Pol V transcript levels at several loci in dms3/ktf1 mutants and found that Pol V transcript levels actually increased in the mutant, not decreased as expected if DMS3/KTF1 is important for transcript production. Pol V transcript levels are also increased in ago4 mutants, probably because AGO4 slicing of the transcripts does not occur. So, KTF1 function may only be for AGO4 recruitment, without affecting Pol V transcription at all. The authors should discuss this and perform an experiment to test whether KTF1 has any effect on Pol V transcription elongation in vitro. KTF1 is SPT5-LIKE, not SPT5 itself, so it may not have the same function as SPT5, as the authors assume.
Reply: Thanks for the comment. We have renamed KTF1 as an AGO4/6-recruiting factor in the revised manuscript. We agree that it would be interesting to know whether KTF1 affects transcription elongation (i.e. NTP addition, RNAP translocation, pausing, backtracking) beyond its recruitment function. However, the proposed experiments are expected to take at least two months (one month for ordering radiolabeled NTPs in China and an additional month for the experiments), and unfortunately we do not have enough time. We are concerned about losing the novelty of our study given the recently published paper on the Pol V elongation complex (PMID: 36893216). The suggested experiments will be performed and reported in a separate manuscript. We have included the following text in the introduction section of the revised manuscript.
"It is intriguing that Pol V transcript levels were barely affected in the rdm3 mutant 28 , raising the possibility that KTF1 might not affect the elongation of Pol V but simply functions as a AGO4/6-recruting factor."

Q3.
A catalytic mutant of NRPE1 should ideally have been used as a control in the transcription assays to prove that the elongation products in Fig. S1D are made by Pol V and not by a contaminating RNA polymerase (e.g., Pols I, II, and III). Including α-amanitin in the transcription assay would at least suppress Pol II's activity, and this is an easy experiment to include.
Reply: Thanks for the suggestion. Our method for Pol V purification includes three steps: an affinity chromatography step, an ion-exchange chromatography step, and a size-exclusion chromatography step, which together minimize the chance of contamination by other polymerases. Mass spectrometry did not detect the largest subunits of other polymerases, suggesting no contamination by Pol I, II, or III. Moreover, the calculated map of Pol V does not show any signal for the foot domain of the Pol II NRPB1 subunit or for the additional subunits of Pol I and Pol III. In summary, we are confident that the extensively purified Pol V does not contain any of the other three RNA polymerases. It would be good to have an experiment with either α-amanitin or a catalytic-mutant NRPE1 as the control, but we do not have time to perform these experiments for the reason explained above.

Reply:
We assigned the H-bonds based on distance (<=3.5 Å), geometry, and map quality. We assigned van der Waals interactions mainly based on distance (<=4.5 Å). Contacts that show reasonable distance and geometry in the model but have poor map density were not included. We have rechecked the model, corrected the distance and geometry of the confident H-bonds, and removed residues with poor map signal from the contact list in the revised manuscript.
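For illustration only, the distance part of these criteria can be sketched in Python. This is not the procedure used for the model: the geometry and map-quality checks described above are not represented, and the function names and category labels are hypothetical. Only the two cutoffs (3.5 Å and 4.5 Å) and the three example distances, which are the values quoted by the referee for K639, come from this document.

```python
import math

# Distance cutoffs stated in the reply (in ångström).
HBOND_MAX_DISTANCE = 3.5
VDW_MAX_DISTANCE = 4.5


def atom_distance(a, b):
    """Euclidean distance between two (x, y, z) coordinates in Å."""
    return math.dist(a, b)


def classify_contact(distance):
    """Classify a contact by distance alone.

    Hypothetical helper: a real assignment would also check donor/acceptor
    geometry and the local map quality, which are not modelled here.
    """
    if distance <= HBOND_MAX_DISTANCE:
        return "hbond-candidate"
    if distance <= VDW_MAX_DISTANCE:
        return "vdw-candidate"
    return "none"


# Distances quoted by the referee for K639 of KTF1-KOW5.
referee_examples = {
    "K639-S855": 4.671,
    "K639-D858": 4.044,
    "K639-S841": 4.825,
}

results = {pair: classify_contact(d) for pair, d in referee_examples.items()}

if __name__ == "__main__":
    for pair, label in results.items():
        print(pair, referee_examples[pair], label)
```

Under the distance cutoffs alone, none of the three K639 contacts quoted by the referee falls within the 3.5 Å H-bond threshold, and only the 4.044 Å contact falls within the 4.5 Å van der Waals threshold.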

Q12.
It is intriguing that the purified Pol V from the transgenic T87 Arabidopsis cells has NRPE5c (AT3G54490) exclusively (according to the LC-MS/MS, Table S1), and not NRPE5a (AT3G57080), which has been detected as the predominant Pol V 5th subunit in other studies from multiple labs*. In this work, the cryo-EM single-particle analysis was able to detect subunits that were not found in LC-MS/MS. Because the structural basis of NRPE5-specific Pol V association is a major part of this manuscript, the identity of the 5th subunit in the reconstruction would need to be clarified.
Reply: Thanks for pointing out the issue. We rechecked the mass spectrometry results carefully and found that one of the proteins without annotation is NRPE5a, suggesting that both NRPE5a and NRPE5c are present in the epitope-tagged Pol V. However, our current map cannot distinguish the two subunits. Because NRPE5a is more relevant to physiological function, we have replaced NRPE5c with NRPE5a in our revised structure model. The text was also modified as follows,
Major concerns: 1. The manuscript mainly discusses structural features of Pol V, which is fine given the nature of the study. However, it would be better to integrate the structural findings of this work with previous genetic and biochemical studies, making the manuscript more informative to a broad readership rather than only to structural biologists in the transcription field. The authors may cite and discuss previous results, especially those related to the major findings in this work.
Reply: Thanks for your suggestion.
We have integrated the structural findings with previous genetic and biochemical studies in the following paragraphs. In the paragraph describing the interactions between Pol V and DNA/RNA, we added, "In summary, the Pol V TEC structure show that the active-site cleft of Pol V retains similar interactions with the RNA-DNA hybrid as Pol II, conferring its capacity of RNA extension and DNA translocation (Supplementary Fig. 1d), consistent with the finding the catalytic activity is essential for the function of Pol V in RdDM pathway 16." In the paragraph describing the structural motifs in the active-site cleft of Pol V, we stated, "In short, the sequence and structural comparison of the key structure motifs (bridge helix and trigger loop) in the active site of Pol V, Pol IV, and Pol II provides structural explanation for the differences of transcription activities of the three RNA polymerases." In the paragraph describing the wide-open DNA main cleft, we stated, "It is unclear how the open conformation of Pol V is related to the function of Pol V, but it might partially account for the slower RNA elongation rate of Pol V creating time windows for AGO4/6 recruitment during Pol V transcription elongation." Moreover, we devote an entire section to the structural explanation of the unique function of Pol V, showing that Pol V has lost the interfaces for the general transcription factors of Pol IV and Pol II. In the 'Discussion' section, we discuss the implications of the KTF1-KOW5 interaction and provide a model for the Pol V-centered DNA methylation pathway.
2. From the data presented in the manuscript, it is not easy to evaluate the quality of the cryo-EM map, especially for the critical residues discussed in the main text. These regions should be shown in cartoon/stick representation, well covered by the corresponding cryo-EM map, to indicate that the structural model was correctly built. The figures could be shown as supplementary figures. Otherwise, discussion at the molecular level may not be supported by the experimental data. It is still fine to describe and discuss these regions by docking structural modules (predicted or from other structures) into the cryo-EM maps, but this should be clarified in the manuscript.